DOI: 10.1145/2582051.2582070
research-article

Two-level fast-forwarding using speech detection for rapidly perusing video

Published: 07 March 2014

Abstract

In video content such as feature films, the main themes and messages are often sufficiently conveyed through dialogue and narration. To augment human capability to consume video content, here we propose a system for watching such videos at very high speed while ensuring that speech is still comprehensible. Specifically, we employ a purpose-built automatic speech detector to realize two-level fast-forwarding for a wide variety of video content: very fast during segments without speech, and understandably fast during segments with speech. In our experiments, practical performance was achieved by frame-by-frame audio classification using Gaussian mixture models trained on subtitle information from 120 commercial DVD movies.
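The pipeline outlined in the abstract can be illustrated with a minimal sketch: each audio frame is scored by a speech GMM and a non-speech GMM, the higher log-likelihood decides the label, and runs of identical labels become segments played at one of two speeds. This is only an illustration under stated assumptions, not the authors' implementation. The names `DiagGMM`, `classify_frames`, `plan_playback`, and `playback_duration`, the diagonal-covariance form, the hand-set model parameters, and the rates of 2x and 8x are all hypothetical; the abstract does not specify features, mixture sizes, or speeds, and the models in the paper are trained from subtitle information rather than set by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class DiagGMM:
    """Gaussian mixture with diagonal covariances (hypothetical helper)."""
    weights: list    # mixture weights, expected to sum to 1
    means: list      # one mean vector per component
    variances: list  # one diagonal variance vector per component

    def log_likelihood(self, x):
        # Log-density of each component, weighted by its mixture weight.
        comp = []
        for w, mu, var in zip(self.weights, self.means, self.variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            comp.append(ll)
        # Log-sum-exp over components for numerical stability.
        top = max(comp)
        return top + math.log(sum(math.exp(c - top) for c in comp))


def classify_frames(features, speech_gmm, nonspeech_gmm):
    """Label each frame as speech when the speech model scores higher."""
    return [speech_gmm.log_likelihood(x) > nonspeech_gmm.log_likelihood(x)
            for x in features]


@dataclass
class Segment:
    start_frame: int  # inclusive
    end_frame: int    # exclusive
    speech: bool
    rate: float       # playback speed multiplier


def plan_playback(labels, speech_rate=2.0, nonspeech_rate=8.0):
    """Merge runs of equal labels and assign one of two playback rates.

    The default rates are assumptions chosen for illustration only.
    """
    segments = []
    for i, is_speech in enumerate(labels):
        if segments and segments[-1].speech == is_speech:
            segments[-1].end_frame = i + 1
        else:
            rate = speech_rate if is_speech else nonspeech_rate
            segments.append(Segment(i, i + 1, is_speech, rate))
    return segments


def playback_duration(segments, frame_sec):
    """Total viewing time in seconds after two-level fast-forwarding."""
    return sum((s.end_frame - s.start_frame) * frame_sec / s.rate
               for s in segments)


# Toy one-dimensional example with hand-set (not trained) models.
speech = DiagGMM([1.0], [[1.0]], [[1.0]])
nonspeech = DiagGMM([1.0], [[-1.0]], [[1.0]])
frames = [[-1.2], [-0.8], [-1.1], [0.9], [1.3]]
labels = classify_frames(frames, speech, nonspeech)
plan = plan_playback(labels)
```

In a real system the per-frame labels would usually be smoothed before segmenting, so that isolated misclassified frames do not cause rapid switching between the two speeds; that step is omitted here.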

Published In

AH '14: Proceedings of the 5th Augmented Human International Conference
March 2014
249 pages
ISBN: 9781450327619
DOI: 10.1145/2582051
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

  • MEET IN KOBE 21st Century

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. GMM
  2. fast-forwarding
  3. video
  4. voice activity detection

Qualifiers

  • Research-article

Conference

AH '14
Sponsor:
  • MEET IN KOBE 21st Century

Acceptance Rates

Overall Acceptance Rate 121 of 306 submissions, 40%
