DOI: 10.1145/3663548.3675617
Open access

Audio Description Customization

Published: 27 October 2024


Blind and low-vision (BLV) people use audio descriptions (ADs) to access videos. However, current ADs cannot be altered by end users and therefore cannot support BLV individuals’ potentially diverse needs and preferences. This research investigates whether customizing AD could improve how BLV individuals consume videos. We conducted an interview study (Study 1) with fifteen BLV participants, which revealed desires to customize properties such as length, emphasis, speed, voice, format, tone, and language. At the same time, concerns emerged about interruptions and the increased interaction load that customization introduces. To examine the effectiveness and tradeoffs of AD customization, we designed CustomAD, a prototype that enables BLV users to customize AD content and presentation. An evaluation study (Study 2) with twelve BLV participants showed that using CustomAD significantly enhanced BLV people’s video understanding, immersion, and information navigation efficiency. Our work illustrates the importance of AD customization and offers a design that enhances video accessibility for BLV individuals.

Supplemental Material

ZIP File
Appendix_CurrentVersion_Accessible_CameraReady.pdf: Supplementary material in PDF format containing the appendix to the paper. The paper notes which appendix to refer to at each relevant point.



Cited By

  • (2024) WorldScribe: Towards Context-Aware Live Visual Descriptions. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1–18. https://doi.org/10.1145/3654777.3676375. Online publication date: 13-Oct-2024



Information & Contributors


Published In

ASSETS '24: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility
October 2024
1475 pages
This work is licensed under a Creative Commons Attribution International 4.0 License.



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Accessibility
  2. Audio Description
  3. Blind and Low-vision Individual
  4. Customization
  5. Video Accessibility


  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Ministry of Education, Singapore



Acceptance Rates

Overall Acceptance Rate 436 of 1,556 submissions, 28%



Bibliometrics & Citations


Article Metrics

  • Downloads (Last 12 months): 351
  • Downloads (Last 6 weeks): 81
Reflects downloads up to 26 Jan 2025
