research-article
Open access

Guiding Novice Web Workers in Making Image Descriptions Using Templates

Published: 19 November 2015

Abstract

This article compares two methods of employing novice Web workers to author descriptions of science, technology, engineering, and mathematics images to make them accessible to individuals with visual and print-reading disabilities. The goal is to identify methods of creating image descriptions that are inexpensive, effective, and follow established accessibility guidelines. The first method explicitly presented the guidelines to the worker, then the worker constructed the image description in an empty text box and table. The second method queried the worker for image information and then used responses to construct a template-based description according to established guidelines. The descriptions generated through queried image description (QID) were more likely to include information on the image category, title, caption, and units. They were also more similar to one another, based on Jaccard distances of q-grams, indicating that their word usage and structure were more standardized. Last, the workers preferred describing images using QID and found the task easier. Therefore, explicit instruction on image-description guidelines is not sufficient to produce quality image descriptions when using novice Web workers. Instead, it is better to provide information about images, then generate descriptions from responses using templates.
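The abstract's similarity measure can be made concrete. The sketch below computes the Jaccard distance between the q-gram sets of two descriptions, the statistic the article uses to show that QID descriptions are more standardized. The q value and the example descriptions are illustrative assumptions, not the paper's actual parameters or data.

```python
def qgrams(text, q=3):
    """Split a string into its set of overlapping q-grams."""
    text = text.lower()
    return {text[i:i + q] for i in range(len(text) - q + 1)}

def jaccard_distance(a, b, q=3):
    """Jaccard distance of two strings' q-gram sets:
    1 - |A intersect B| / |A union B|. 0.0 means identical sets."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    if not ga and not gb:
        return 0.0
    return 1.0 - len(ga & gb) / len(ga | gb)

# Template-generated descriptions share wording and structure, so their
# pairwise distance is small compared to a free-form description.
d1 = "Bar graph titled Rainfall. Units are millimeters."
d2 = "Bar graph titled Rainfall. Units are inches."
d3 = "This picture shows how much rain fell each month."
print(jaccard_distance(d1, d2) < jaccard_distance(d1, d3))  # True
```

Lower average pairwise distance within a group of workers is what the article reads as more standardized word usage and structure.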

References

[1] Benetech. 2012. POET image description tool. Retrieved October 10, 2015, from http://diagramcenter.org/development/poet.html.
[2] Benetech and Touch Graphics. 2014. Decision tree: Image sorting tool. Retrieved October 10, 2015, from http://diagramcenter.org/decision-tree.html.
[3] Tim Berners-Lee, James Hendler, and Ora Lassila. 2001. The Semantic Web. Scientific American 284, 5, 28--37.
[4] Jeffrey P. Bigham, Chandrika Jayant, Hanjie Ji, Greg Little, Andrew Miller, Robert C. Miller, Robin Miller, Aubrey Tatarowicz, Brandyn White, Samuel White, and Tom Yeh. 2010a. VizWiz: Nearly real-time answers to visual questions. In Proceedings of the 23rd Annual Symposium on User Interface Software and Technology. ACM, New York, NY, 333--342.
[5] Jeffrey P. Bigham, Chandrika Jayant, Andrew Miller, Brandyn White, and Tom Yeh. 2010b. VizWiz: LocateIt - enabling blind people to locate objects in their environment. In Proceedings of the Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'10). IEEE, Los Alamitos, CA, 65--72.
[6] Jeffrey P. Bigham, Richard E. Ladner, and Yevgen Borodin. 2011. The design of human-powered access technology. In Proceedings of the 13th International Conference on Computers and Accessibility (SIGACCESS'11). ACM, New York, NY, 3--10.
[7] Rune Haubo Bojesen Christensen, Hye-Seong Lee, and Per Bruun Brockhoff. 2012. Estimation of the Thurstonian model for the 2-AC protocol. Food Quality and Preference 24, 1, 119--128.
[8] Leonid Boytsov. 2011. Indexing methods for approximate dictionary searching: Comparative analysis. Journal of Experimental Algorithmics 16, 1.
[9] Sandra Carberry, Stephanie Elzer Schwartz, Kathleen McCoy, Seniz Demir, Peng Wu, Charles Greenbacker, Daniel Chester, Edward Schwartz, David Oliver, and Priscilla Moraes. 2012. Access to multimodal articles for individuals with sight impairments. ACM Transactions on Interactive Intelligent Systems 2, 4, 21.
[10] Surajit Chaudhuri, Kris Ganjam, Venkatesh Ganti, and Rajeev Motwani. 2003. Robust and efficient fuzzy match for online data cleaning. In Proceedings of the International Conference on Management of Data (SIGMOD'03). ACM, New York, NY, 313--324.
[11] Daniel Dardailler. 1997. The ALT-Server ("An Eye for an Alt"). Retrieved October 10, 2015, from http://www.w3.org/WAI/altserv.htm.
[12] Seniz Demir, Sandra Carberry, and Kathleen F. McCoy. 2012. Summarizing information graphics textually. Computational Linguistics 38, 3, 527--574.
[13] Seniz Demir, David Oliver, Edward Schwartz, Stephanie Elzer, Sandra Carberry, Kathleen F. McCoy, and Daniel Chester. 2010. Interactive SIGHT: Textual access to simple bar charts. New Review of Hypermedia and Multimedia 16, 3, 245--279.
[14] Seniz Demir, Stephanie Elzer Schwartz, Richard Burns, and Sandra Carberry. 2013. What is being measured in an information graphic? In Computational Linguistics and Intelligent Text Processing. Springer, 501--512.
[15] Michel Dumontier, Leo Ferres, and Natalia Villanueva-Rosales. 2010. Modeling and querying graphical representations of statistical data. Web Semantics: Science, Services and Agents on the World Wide Web 8, 2, 241--254.
[16] Stephanie Elzer, Sandra Carberry, Ingrid Zukerman, Daniel Chester, Nancy Green, and Seniz Demir. 2005. A probabilistic framework for recognizing intention in information graphics. In Proceedings of the International Joint Conference on Artificial Intelligence, Vol. 19. 1042.
[17] Massimo Fasciano and Guy Lapalme. 1996. Postgraphe: A system for the generation of statistical graphics and text. In Proceedings of the 8th International Workshop on Natural Language Generation (INLG'96). 51--60.
[18] Yansong Feng and Mirella Lapata. 2010. How many words is a picture worth? Automatic caption generation for news images. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. 1239--1249.
[19] Leo Ferres, Gitte Lindgaard, Livia Sumegi, and Bruce Tsuji. 2013. Evaluating a tool for improving accessibility to charts and graphs. ACM Transactions on Computer-Human Interaction 20, 5, 28.
[20] Bryan Gould, Trisha O'Connell, and Geoffrey Freed. 2008. Guidelines for describing STEM images. Retrieved October 10, 2015, from http://ncam.wgbh.org/experience_learn/educational_media/stemdx/guidelines.
[21] Chandrika Jayant, Matt Renzelmann, Dana Wen, Satria Krisnandi, Richard Ladner, and Dan Comden. 2007. Automated tactile graphics translation: In the field. In Proceedings of the 9th International Conference on Computers and Accessibility (SIGACCESS'07). ACM, New York, NY, 75--82.
[22] Geoffrey Keppel and Thomas D. Wickens. 2004. Design and Analysis: A Researcher's Handbook (4th ed.). Pearson Education, Upper Saddle River, NJ.
[23] Richard E. Ladner, Melody Y. Ivory, Rajesh Rao, Sheryl Burgstahler, Dan Comden, Sangyun Hahn, Matthew Renzelmann, Satria Krisnandi, Mahalakshmi Ramasamy, Beverly Slabosky, Andrew Martin, Amelia Lacenski, Stuart Olsen, and Dmitri Groce. 2005. Automating tactile graphics translation. In Proceedings of the 7th International Conference on Computers and Accessibility (SIGACCESS'05). ACM, New York, NY, 150--157.
[24] Walter Lasecki, Christopher Miller, Adam Sadilek, Andrew Abumoussa, Donato Borrello, Raja Kushalnagar, and Jeffrey Bigham. 2012. Real-time captioning by groups of non-experts. In Proceedings of the 25th Annual Symposium on User Interface Software and Technology. ACM, New York, NY, 23--34.
[25] LimeSurvey Project Team/Carsten Schmitz. 2012. LimeSurvey: An Open Source Survey Tool. LimeSurvey Project, Hamburg, Germany. http://www.limesurvey.org.
[26] Kathleen F. McCoy, Sandra Carberry, Tom Roper, and Nancy Green. 2001. Towards generating textual summaries of graphs. In Proceedings of the International Conference on Universal Access in Human-Computer Interaction. 695--699.
[27] R Core Team. 2013. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org.
[28] Daisuke Sato, Masatomo Kobayashi, Hironobu Takagi, and Chieko Asakawa. 2010. Social accessibility: The challenge of improving Web accessibility through collaboration. In Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A'10). ACM, New York, NY, 28.
[29] Hironobu Takagi, Susumu Harada, Daisuke Sato, and Chieko Asakawa. 2013. Lessons learned from crowd accessibility services. In Human-Computer Interaction - INTERACT 2013. Springer, 587--604.
[30] Esko Ukkonen. 1992. Approximate string-matching with q-grams and maximal matches. Theoretical Computer Science 92, 1, 191--211.
[31] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2014. Show and tell: A neural image caption generator. arXiv preprint arXiv:1411.4555.
[32] Luis Von Ahn and Laura Dabbish. 2004. Labeling images with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, NY, 319--326.
[33] Luis Von Ahn, Shiry Ginosar, Mihir Kedia, Ruoran Liu, and Manuel Blum. 2006. Improving accessibility of the Web with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, NY, 79--82.
[34] Peng Wu, Sandra Carberry, Stephanie Elzer, and Daniel Chester. 2010. Recognizing the intended message of line graphs. In Diagrammatic Representation and Inference. Springer, 220--234.


Index Terms

  1. Guiding Novice Web Workers in Making Image Descriptions Using Templates


      Reviews

      William Brinkman

      Making science, technology, engineering, and math (STEM) education available to all who are capable of learning is a moral imperative. Yet our educational system (including authors, textbook publishers, and college professors) struggles to provide appropriate access to the images and figures that are critical to STEM subject learning. Morash et al. envision a system that would allow nonexpert workers (recruited through a service such as Amazon's Mechanical Turk) to create high-quality accessible descriptions (also known as alt-text) of STEM images. The success of such a system could greatly reduce the cost of making STEM teaching materials accessible, and thereby greatly increase access to STEM education for people with visual impairments. Their main contribution is to demonstrate that the design of the system's user interface influences the completeness and uniformity of the resulting alt-text. Current web-based systems for this problem simply present the worker with an image and a set of instructions, and allow the worker to enter his or her description as free text. The authors have created a competing system (which they call a queried image description, QID) that uses an interactive survey tool to gather information from the worker, and then auto-generates the image description using a template. Web workers using QID are significantly less likely to omit key information (like captions, or units on graphs) than those using free text entry. There is also significantly less variation in descriptions generated by different workers when using QID as compared to free text entry, which should simplify quality control and reduce user confusion. While there is a well-founded hope that QID-generated alt-text will be more usable than free text, and comparable to alt-text created by experts, such usability testing is left as future work. 
Another notable aspect of this paper is the bringing together of "greatest hits" from several different areas of computer science research. Ukkonen's approximate string matching, Von Ahn et al.'s image labeling, and the Jaccard coefficient are all ideas that graduate students should see. This paper could therefore be the starting point for a nice seminar course. These ideas should find widespread adoption in the future, if it can be shown that such a system generates alt-text at a level of quality comparable to an expert. The pressures on colleges and textbook publishers to make STEM education accessible are intense, and this approach has the potential to solve one of the major barriers to doing so.

Online Computing Reviews Service
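The QID workflow the review describes - query the worker for structured fields, then auto-generate the description from a fixed template - can be sketched as follows. The field names, questions, and template here are illustrative assumptions, not the authors' actual questionnaire or templates.

```python
# Hypothetical QID-style flow: ask for structured fields, then fill a
# fixed template so every description carries the same information in
# the same order (category, title, summary, units).
QUESTIONS = {
    "category": "What kind of image is this (e.g., bar graph, diagram)?",
    "title": "What is the image's title?",
    "summary": "In one sentence, what does the image show?",
    "units": "What units are shown, if any?",
}

TEMPLATE = ('{category} titled "{title}". {summary} '
            'Values are given in {units}.')

def describe(answers):
    """Fill the fixed template from the worker's answers."""
    return TEMPLATE.format(**answers)

answers = {
    "category": "Bar graph",
    "title": "Monthly Rainfall",
    "summary": "It compares rainfall across twelve months.",
    "units": "millimeters",
}
print(describe(answers))
# Bar graph titled "Monthly Rainfall". It compares rainfall across
# twelve months. Values are given in millimeters.
```

Because the worker never writes free text for the structural parts, the template guarantees the fields the study found most often omitted (category, title, caption, units) appear in every description.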



      Published In

      ACM Transactions on Accessible Computing  Volume 7, Issue 4
      November 2015
      77 pages
      ISSN:1936-7228
      EISSN:1936-7236
      DOI:10.1145/2847216
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 19 November 2015
      Accepted: 01 April 2015
      Revised: 01 April 2015
      Received: 01 September 2014
      Published in TACCESS Volume 7, Issue 4


      Author Tags

1. accessibility (blind and visually impaired)
      2. access technology
      3. crowdsourcing
      4. human computation
      5. image description

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • Office of Special Education Programs
      • Department of Education
      • Cooperative Agreement
• Benetech's DIAGRAM Center initiative

      Article Metrics

      • Downloads (Last 12 months)196
      • Downloads (Last 6 weeks)34
      Reflects downloads up to 01 Nov 2024


      Cited By

• (2024) Evaluating the Effectiveness of STEM Images Captioning. Proceedings of the 21st International Web for All Conference, 150-159. DOI: 10.1145/3677846.3677863. Online publication date: 13-May-2024.
• (2024) MAIDR Meets AI: Exploring Multimodal LLM-Based Data Visualization Interpretation by and with Blind and Low-Vision Users. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-31. DOI: 10.1145/3663548.3675660. Online publication date: 27-Oct-2024.
• (2024) Context-Aware Image Descriptions for Web Accessibility. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, 1-17. DOI: 10.1145/3663548.3675658. Online publication date: 27-Oct-2024.
• (2024) FigurA11y: AI Assistance for Writing Scientific Alt Text. Proceedings of the 29th International Conference on Intelligent User Interfaces, 886-906. DOI: 10.1145/3640543.3645212. Online publication date: 18-Mar-2024.
• (2024) Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-22. DOI: 10.1145/3613904.3642943. Online publication date: 11-May-2024.
• (2024) MAIDR: Making Statistical Visualizations Accessible with Multimodal Data Representation. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-22. DOI: 10.1145/3613904.3642730. Online publication date: 11-May-2024.
• (2024) Designing Unobtrusive Modulated Electrotactile Feedback on Fingertip Edge to Assist Blind and Low Vision (BLV) People in Comprehending Charts. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-20. DOI: 10.1145/3613904.3642546. Online publication date: 11-May-2024.
• (2024) "It's Kind of Context Dependent": Understanding Blind and Low Vision People's Video Accessibility Preferences Across Viewing Scenarios. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-20. DOI: 10.1145/3613904.3642238. Online publication date: 11-May-2024.
• (2023) WATAA: Web Alternative Text Authoring Assistant for Improving Web Content Accessibility. Companion Proceedings of the 28th International Conference on Intelligent User Interfaces, 41-45. DOI: 10.1145/3581754.3584127. Online publication date: 27-Mar-2023.
• (2023) The Accessibility of Data Visualizations on the Web for Screen Reader Users: Practices and Experiences During COVID-19. ACM Transactions on Accessible Computing 16, 1, 1-29. DOI: 10.1145/3557899. Online publication date: 29-Mar-2023.
• Show More Cited By
