Open access

Audio Description Customization

Authors: Rosiana Natalie, Ruei-Che Chang, Smitha Sheshadri, Kotaro Hara

ASSETS '24: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility

Article No.: 39, Pages 1 - 19

https://doi.org/10.1145/3663548.3675617

Published: 27 October 2024

Abstract

Blind and low-vision (BLV) people use audio descriptions (ADs) to access videos. However, end users cannot alter current ADs, which therefore cannot support BLV individuals’ potentially diverse needs and preferences. This research investigates whether customizing AD could improve how BLV individuals consume videos. We conducted an interview study (Study 1) with fifteen BLV participants, which revealed desires to customize properties such as length, emphasis, speed, voice, format, tone, and language. At the same time, concerns emerged about interruptions and the increased interaction load that customization could bring. To examine the effectiveness and tradeoffs of AD customization, we designed CustomAD, a prototype that enables BLV users to customize AD content and presentation. An evaluation study (Study 2) with twelve BLV participants showed that using CustomAD significantly enhanced BLV people’s video understanding, immersion, and information navigation efficiency. Our work illustrates the importance of AD customization and offers a design that enhances video accessibility for BLV individuals.

Supplemental Material

Appendix_CurrentVersion_Accessible_CameraReady.pdf (ZIP file, 75.42 MB): Supplementary material in PDF format containing the appendix to the paper. Cross-references between the paper and the appendix indicate which appendix to consult.

Cited By

Chang R, Liu Y, Guo A (2024). WorldScribe: Towards Context-Aware Live Visual Descriptions. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-18. DOI: 10.1145/3654777.3676375. Online publication date: 13-Oct-2024.
https://dl.acm.org/doi/10.1145/3654777.3676375

Index Terms

Audio Description Customization
1. Human-centered computing
  1. Accessibility
  2. Human computer interaction (HCI)
    1. Interaction techniques
      1. Auditory feedback
2. Social and professional topics
  1. Professional topics
    1. Computing profession
      1. Assistive technologies
  2. User characteristics
    1. People with disabilities

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

ASSETS '24: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility

October 2024

1475 pages

ISBN:9798400706776

DOI:10.1145/3663548

Editors:
David Flatla (University of Guelph, Canada)
Faustina Hwang (University of Reading, United Kingdom)
Tiago Guerreiro (University of Lisbon, Portugal)
Robin Brewer (University of Michigan, United States)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGACCESS: ACM Special Interest Group on Accessible Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Ministry of Education, Singapore

Conference

ASSETS '24

Sponsor:

SIGACCESS

ASSETS '24: The 26th International ACM SIGACCESS Conference on Computers and Accessibility

October 27 - 30, 2024

St. John's, NL, Canada

Acceptance Rates

Overall Acceptance Rate 436 of 1,556 submissions, 28%
