poster

Providing synthesized audio description for online videos

Authors: Masatomo Kobayashi, Kentarou Fukuda, Hironobu Takagi, Chieko Asakawa

Assets '09: Proceedings of the 11th international ACM SIGACCESS conference on Computers and accessibility

Pages 249 - 250

https://doi.org/10.1145/1639642.1639699

Published: 25 October 2009

Abstract

We describe an initial attempt to develop a common platform for adding audio description (AD) to online videos so that blind and visually impaired people can enjoy such material. Speech synthesis lets content providers offer AD at minimal cost. We store the AD as external metadata, which keeps it independent of the video format and also allows external supporters to add ADs to any online video. Our technology comprises an authoring tool for writing AD scripts, a Web browser add-on that synthesizes ADs in synchronization with the original videos, and a text-based format for exchanging AD scripts.
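To make the idea of an external, text-based AD script more concrete, the following is a minimal sketch and not the format or implementation described in the paper. It assumes a hypothetical plain-text script in which each line holds a `m:ss` timestamp followed by the description text, and it shows how a player add-on might pick the descriptions that become due between two playback-time polls. The names `Cue`, `parse_script`, and `due_cues` are illustrative assumptions; the actual speech synthesis call is left out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: float  # seconds from the beginning of the video
    text: str     # description text to hand to a speech synthesizer


# Hypothetical line format (an assumption, not the paper's format):
#   m:ss[.fraction] <description text>
_LINE = re.compile(r"^(\d+):(\d{2})(?:\.(\d+))?\s+(.+)$")


def parse_script(script: str) -> list[Cue]:
    """Parse a plain-text AD script into cues sorted by start time."""
    cues = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"malformed cue line: {line!r}")
        minutes, seconds, fraction, text = match.groups()
        start = int(minutes) * 60 + int(seconds)
        start += float("0." + fraction) if fraction else 0.0
        cues.append(Cue(start, text))
    return sorted(cues, key=lambda cue: cue.start)


def due_cues(cues: list[Cue], previous_time: float, current_time: float) -> list[Cue]:
    """Return cues whose start time lies in [previous_time, current_time)."""
    return [cue for cue in cues if previous_time <= cue.start < current_time]


if __name__ == "__main__":
    demo = """
    # AD script kept separately from the video file
    0:02 A man enters the room.
    0:07.5 He opens a laptop.
    """
    cues = parse_script(demo)
    # Simulate a player polling its playback position every half second.
    previous = 0.0
    for tick in range(1, 17):
        now = tick * 0.5
        for cue in due_cues(cues, previous, now):
            print(f"[{cue.start:5.1f}s] speak: {cue.text}")
        previous = now
```

Because the script lives outside the video file, the same cue list could in principle be paired with any player that exposes its current playback time, which is the format independence the abstract refers to.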

Published In

Assets '09: Proceedings of the 11th international ACM SIGACCESS conference on Computers and accessibility

October 2009

290 pages

ISBN: 9781605585581

DOI: 10.1145/1639642

General Chair: Shari Trewin, IBM T. J. Watson Research Center, USA

Program Chair: Kathleen F. McCoy, University of Delaware, USA

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

ASSETS09

Sponsor:

SIGACCESS

ASSETS09: The 11th International ACM SIGACCESS Conference on Computers and Accessibility

October 25 - 28, 2009

Pittsburgh, Pennsylvania, USA

Acceptance Rates

Overall Acceptance Rate 436 of 1,556 submissions, 28%
